Construction and impromptu repair of an MST in a distributed network with o(m) communication

Abstract

In the CONGEST model, a communications network is an undirected graph whose n nodes are
processors and whose m edges are the communications links between processors. At any given time
step, a message of size O(log n) may be sent by each node to each of its neighbors. We show for the
synchronous model: if all nodes start in the same round, and each node knows its ID and the IDs of its
neighbors (or, in the case of MST, the distinct weights of its incident edges) and knows n, then there are
Monte Carlo algorithms which succeed w.h.p. in determining a minimum spanning forest (MST) and a
spanning forest (ST) using O(n log^2 n / log log n) messages for MST and O(n log n) messages for ST, resp.
These results contradict the “folk theorem” noted in Awerbuch et al., JACM 1990, that the distributed
construction of a broadcast tree requires Ω(m) messages. This lower bound has been shown there and
in other papers for some CONGEST models; our protocol demonstrates the limits of these models.

A dynamic distributed network is one which undergoes online edge insertions or deletions. We also
show how to repair an MST or ST in a dynamic network with asynchronous communication. An edge
deletion can be processed in O(n log n / log log n) expected messages for the MST problem, and O(n) expected
messages for the ST problem, while an edge insertion uses O(n) messages in the worst case. We call this
“impromptu” updating as we assume that between processing of edge updates there is no preprocessing
or storage of additional information. Previous algorithms for this problem that use an amortized o(m)
messages per update require substantial preprocessing and additional local storage between updates.

1 Introduction


The problem of finding a minimum spanning forest (MST) or computing a spanning forest (ST) in a 
communications network is one of the most fundamental and heavily studied problems in distributed 
computing. This problem is important for facilitating broadcast and coordination in a message-efficient 
manner. Given such a tree, messages may be broadcast from one node to all others or values from all nodes 
can be combined from the leaves up to one node in time proportional to the diameter of the tree, with a 
number of messages which is proportional to the size of the tree, rather than all edges in the network, as 
when communication is by flooding. For this reason, a tree is useful for tasks such as leader election, mutual 
exclusion, and reset (adaptation of any static algorithm to changes in the network topology). Below, we 
consider a network to be a graph with n nodes and m edges. 

In 1983, Gallager, Humblet and Spira gave a now classic algorithm for finding an MST in a distributed
asynchronous communications network with message complexity O(m + n log n) for a network with n nodes
and m edges. The message complexity for this problem has not been improved until now, even for the
easier context we consider here (a synchronous network with nodes initialized to start the algorithm at
the same time). For the unweighted problem of ST, a single node starting a flooding algorithm can
construct a broadcast tree with O(m) messages in time equal to the diameter of the network (see, e.g., [32]).

That Ω(m) messages are required for broadcast (and the construction of an ST) is mentioned as
“folklore” by Awerbuch, Goldreich, Peleg and Vainish (1990) [1]. They prove this lower bound in what is
referred to as the “standard” KT_1 model, where each node knows its own identity and the identities of its
neighbors. The Ω(m) lower bound holds for randomized (Monte Carlo) comparison protocols, where the
basic computation step is to compare two processors’ identities, and for general algorithms where the set
of IDs is very large and grows independently with respect to message size, time and randomness. In 2013
Kutten, Pandurangan, Peleg, Robinson, and Trehan showed an Ω(m) lower bound for randomized general
algorithms in the KT_0 model, where each node does not know the identities of its neighbors. All these
lower bounds hold when the size of the network is known to all the nodes, the network is synchronous, and
all the nodes start simultaneously. Our MST and ST algorithms avoid both lower bounds by assuming
KT_1 and an exponential bound on the size of the identity space.

Communication networks are inherently dynamic, in that a link may be either deleted or inserted
over time. This paper also presents algorithms to repair an MST or ST in an asynchronous network
upon an edge insertion or deletion. These algorithms have the new property (for an efficient dynamic
graph algorithm) of being “impromptu”, that is, they require no preprocessing or storage of auxiliary
information except during the processing of the current update. Between updates, a node knows only the
names and weights of its incident edges and whether these edges are in the currently maintained MST or
ST. While there are previously known algorithms for updating MST and ST with O(n) messages, these
have significant memory requirements and require the communication costs to be amortized over a
sufficiently long sequence of updates. For example, the 2008 algorithm of Awerbuch et al. [5] to maintain an MST
uses O(n) amortized messages per update (somewhat better than our second algorithm), but
stores Θ(Δ_v n log n) bits at each node v, where Δ_v is the number of node v’s neighbors.

We first describe the model and then the results. A communications network is a graph. We assume
that every node knows a bound n on the actual number of nodes. An interesting case is when the known
upper bound on the size of the network is very tight (e.g., the actual size multiplied by some small positive
integer constant). In this case, all our asymptotic results are in terms of the actual network size. Hence for
simplicity, we refer below to n as the network size (rather than an upper bound). The communication links
are undirected edges and each node has a unique ID ∈ {1, 2, ..., n^c}. In fact, using the classic Karp-Rabin
fingerprinting, w.h.p., we can easily map n IDs in exponential ID space to distinct IDs in polynomial
ID space. For the MST problem, each edge has a weight ∈ {1, 2, ..., u} for any positive integer u. Each
node knows its own ID, the weight of each incident edge, the ID of its other endpoint, and n. No other
information about the graph is known to any node at the start of any of our algorithms. A message is a
communication of O(log(n + u)) bits which is passed along a single edge.

^ According to , “keeping track of history enables significant improvements in the communication complexity of dynamic
networks protocols.” Atop its abstract, this was also stated in a more lyrical way: “Those who cannot remember the past are
condemned to repeat it (George Santayana).” A part of the message of the current paper may be adding “unless they flip
coins...” Here we show that history can be replaced by random coin tosses.

A network is properly marked if every edge is marked by both or neither of its endpoints. A tree T is
maintained by a network if the network is properly marked and T is a maximal tree in the subgraph of
marked edges. For a node x, let T_x denote the tree maintained by the network and containing node x. We
call an (unmarked) edge with exactly one endpoint in T an edge leaving T, or outgoing. A tree construction
problem assumes that initially all edges are unmarked and every node knows to begin construction. At
the end of the algorithm the network should maintain the MST (or ST). We use the usual definitions of
synchrony and asynchrony: a synchronized network assumes a global clock, and messages are received
in one time step. An asynchronous network assumes only that messages are eventually received. Each node’s
action is triggered by the receipt of a message or other change to its state. We say an event occurs
“w.h.p.” (with high probability) if for any constant c (which is given as a parameter of the algorithm), the
probability of the event is at least 1 - n^{-c}. We show:

Theorem 1.1. There are algorithms to construct a minimum spanning tree (MST) and spanning tree (ST)
succeeding w.h.p. in a synchronous network of n nodes using time and messages O(n log^2 n / log log n) for
MST and O(n log n) for ST, and O(log(n + u)) local memory per node. This assumes that each node is
initialized to start the algorithm and its only initial knowledge of the graph is its ID, its neighbors’ IDs,
the weight of each of its incident edges, and n.

Theorem 1.2. Upon deletion or increase in weight of an edge, there are algorithms FindAny and FindMin
to repair an ST and an MST, respectively, which find a replacement edge if there is any, in an
asynchronous distributed network using expected time and messages O(n) for the ST, and O(n log n / log log n)
for the MST, and O(log(n + u)) local memory per node. Upon insertion or decrease in weight of an edge, a
deterministic algorithm with O(n) time and messages suffices to repair the tree. All repairs are impromptu,
i.e., no preprocessing or extra storage is needed between updates. This assumes each node knows its ID,
the IDs of its neighbors and the weight of each incident edge. To achieve success with probability 1 - n^{-c},
each node must know an upper bound on n which is within a polynomial of n.

Modified versions FindAny-C and FindMin-C are also presented. Their worst case cost matches
the expected cost of FindAny and FindMin. When there is a replacement for a deleted tree edge, w.h.p.,
they return either a correct replacement edge or ∅, and with constant probability, they return the former.

Our algorithms are based on the following new procedures which may be useful in other contexts.
Below, node x initiates the procedure and receives the output, and j, k ∈ {1, ..., u}:

• TestOut(x, j, k): Returns true with constant probability if there is an edge leaving T_x with edge
weight in the interval [j, k]; false otherwise. Always correct if true is returned.

• HP-TestOut(x, j, k): The same as TestOut but w.h.p.

A basic communication step in our network is a simple distributed routine broadcast-and-echo [13]. It
is initiated by the broadcast of a message by a node x which becomes the “root” of a tree. When a node v
receives a broadcast message from its neighbor y, it designates y as its “parent” (for the sake of the current
communication step) and sends a broadcast message to each of its other neighbors in T, its “children”.
When a leaf node in T receives a broadcast message, it sends a message (“echo”) to its parent, possibly
carrying some value. When a non-leaf node has received an echo message from every child, it sends an
echo message to its parent, possibly aggregating its value with the values sent by its children. When the
root has received echo messages from all its children, the broadcast-and-echo is done. We show:
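
To make the message count concrete, the following is a minimal sequential simulation of one broadcast-and-echo, not the distributed protocol itself. The function name `broadcast_and_echo`, the adjacency-list representation, and the `local_value`/`combine` callbacks are assumptions made for illustration; the sketch only shows that a tree with |T| nodes uses 2(|T| - 1) messages and that the root ends up with the aggregate.

```python
from typing import Callable, Dict, List, Tuple


def broadcast_and_echo(tree: Dict[int, List[int]], root: int,
                       local_value: Callable[[int], int],
                       combine: Callable[[int, int], int]) -> Tuple[int, int]:
    """Simulate one broadcast-and-echo on a tree (adjacency lists).

    Returns (value aggregated at the root, number of messages sent).
    """
    messages = 0
    parent = {root: None}
    order = [root]                  # nodes in the order the broadcast reaches them
    for v in order:                 # broadcast phase (the list grows while we scan it)
        for child in tree[v]:
            if child != parent[v]:
                parent[child] = v
                order.append(child)
                messages += 1       # one broadcast message per tree edge
    value = {v: local_value(v) for v in order}
    for v in reversed(order):       # echo phase: every child is handled before its parent
        if parent[v] is not None:
            value[parent[v]] = combine(value[parent[v]], value[v])
            messages += 1           # one echo message per tree edge
    return value[root], messages
```

For example, summing the node IDs of a four-node tree rooted at node 1 yields the total at the root after six messages.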

Lemma 1. TestOut and HP-TestOut can be performed with one broadcast-and-echo with message size
O(log(n + u)). The echo of TestOut requires a message of only one bit.

Other previous work 

Similar techniques: TestOut uses the principle that each edge with two endpoints in a tree contributes 0 to
the parity of the sum of the degrees of the nodes in a tree, while each edge which leaves a tree contributes
1. Therefore, in a randomly sampled graph, there is a 1/2 chance that if there are one or more edges
leaving a tree, the parity of the sum of the degrees is odd. This observation is used in a paper on graph
sketching [6] and a paper on sequential dynamic graph connectivity data structures [15]. It is not clear
how to adapt the techniques of [6] to the distributed setting. Those of [15] were adapted to a distributed
version [26]; however, it was not impromptu (it required keeping supplemental storage between updates),
did not address an MST, and was much more complicated than the repair algorithm presented here.

MST and ST construction: The complexity of the first distributed MST construction algorithm was not
analyzed [8]. Following the seminal paper of Gallager, Humblet and Spira mentioned above, Awerbuch improved
the time complexity to O(n) [2], retaining the same message complexity of O(m + n log n). Distributed algorithms that are
faster (when the diameter of the network or the diameter of the MST is smaller) do exist.
However, their message and memory space complexities are much higher.

Simultaneous edge changes: As opposed to the previous o(m) (but non-impromptu) repair algorithms,
ours has not been extended to deal with multiple updates at a time, though we believe it can be.

Definitions and Organization 

Definitions: An edge {u, v}’s edge number is the concatenation of the unique IDs of the edge’s endpoints,
smallest first. We create unique weights (as in [13]) by concatenating the weight to the front of its edge
number. For any tree T, maxID(T), maxEdgeNum(T), and maxWt(T) denote the maximum ID of any node
in T, the maximum edge number, and the maximum edge weight, resp., of any edge incident to a node in T. T is omitted where
it is understood from context. Let [j, k] denote the set of integers {j, j + 1, ..., k} and lg n denote log_2 n.
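
As a small illustration of these definitions, the sketch below builds edge numbers and unique weights by bit concatenation. The fixed ID width `id_bits` and both function names are assumptions introduced here; the text only specifies the concatenation order.

```python
def edge_number(u: int, v: int, id_bits: int) -> int:
    """Concatenate the two endpoint IDs, smallest first (IDs assumed < 2**id_bits)."""
    lo, hi = min(u, v), max(u, v)
    return (lo << id_bits) | hi


def unique_weight(weight: int, u: int, v: int, id_bits: int) -> int:
    """Prepend the weight to the edge number, so ties are broken by edge number."""
    return (weight << (2 * id_bits)) | edge_number(u, v, id_bits)
```

With this encoding, comparing two unique weights compares the original weights first and falls back to the edge numbers only when the weights are equal.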

Organization: The functions TestOut and HP-TestOut are described in Section 2. Section 3 describes
FindMin, an algorithm for dynamic MST, and an algorithm for constructing an MST (Section 3.3). Section 4.1
describes FindAny and reduces the complexity for construction and repair for ST. The Appendix contains
an extension to the case where edge weights may be superpolynomial in n, and some deferred proofs.


2 TestOut 

In this section, we describe TestOut and HP-TestOut.


2.1 Random odd hash functions and TestOut 

As a method to sample edges, we use the concept of an odd hash function: We say that a random hash
function h : [1, m] → {0, 1} is ε-odd if, for any given non-empty set S ⊆ [1, m], there are an odd number of
elements in S which hash to 1 with probability at least ε, that is,

\[ \Pr_h\left[ \sum_{x \in S} h(x) \bmod 2 = 1 \right] \ge \varepsilon. \qquad (1) \]


An odd hash function is a type of “distinguisher” described in [33]; we use their construction here.
Let m ≤ 2^w. We pick a uniform odd multiplier a from [1, 2^w] and a uniform threshold t ∈ [1, 2^w]. From
these two components, we define h : [1, 2^w] → {0, 1} as

\[ h(x) = \begin{cases} 1 & \text{if } (ax \bmod 2^w) < t, \\ 0 & \text{otherwise.} \end{cases} \]

The above is particularly efficient if w ∈ {8, 32, 64} in a programming language like C, for there the
mod-operation comes for free as part of an integer multiplication which automatically discards overflow beyond
the w bits. From [33] we see that h is a (1/8)-odd hash function.
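
The multiply-and-threshold construction can be sketched in a few lines. This is an illustrative sketch only: the names `odd_hash`, `random_odd_hash` and `parity_one_rate` are invented here, Python's `random.Random` stands in for the nodes' source of randomness, and the masking with `2**w - 1` plays the role of the overflow discard mentioned for C.

```python
import random
from typing import Callable, Iterable


def odd_hash(a: int, t: int, w: int) -> Callable[[int], int]:
    """h(x) = 1 if (a*x mod 2^w) < t, else 0, for an odd multiplier a."""
    mask = (1 << w) - 1
    return lambda x: 1 if ((a * x) & mask) < t else 0


def random_odd_hash(w: int, rng: random.Random) -> Callable[[int], int]:
    a = rng.randrange(1, 1 << w, 2)   # uniform odd multiplier below 2^w
    t = rng.randint(1, 1 << w)        # uniform threshold in [1, 2^w]
    return odd_hash(a, t, w)


def parity_one_rate(s: Iterable[int], w: int, trials: int,
                    rng: random.Random) -> float:
    """Empirical estimate of the probability in (1) for a fixed set s."""
    items = list(s)
    odd = 0
    for _ in range(trials):
        h = random_odd_hash(w, rng)
        odd += sum(h(x) for x in items) % 2
    return odd / trials
```

Running `parity_one_rate` on a few small sets gives a quick empirical sanity check of the ε-odd property; it is not a substitute for the bound proved in [33].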

^ The idea behind such an extension would be, essentially, to use the algorithm of Awerbuch et al. of 2008, but replace
their method of finding replacement edges with the method used here.

^ [33] was originally inspired by the developments in the current paper.

Let h : [1, maxEdgeNum] → {0, 1} be an odd hash function. We show how to compute TestOut. Let
E(v) denote the edge numbers of edges incident to node v. Let Cut(T, V \ T) denote the set of edges with
exactly one endpoint in T. To test with constant probability whether there exists any edge leaving a tree
T, each node v with E(v) ≠ ∅ computes

\[ \sum_{e \in E(v)} h(e) \bmod 2 \]

locally. If E(v) = ∅, then 0 is returned. These values are aggregated over the nodes in T to compute

\[ \sum_{v \in V(T)} \sum_{e \in E(v)} h(e) \bmod 2 \;=\; \sum_{e \in Cut(T, V \setminus T)} h(e) \bmod 2. \]

The equality holds because an edge with both endpoints in T is counted twice on the left and so contributes 0 mod 2.

TestOut(x) can be done with one broadcast of h from node x and one 1-bit echo. First x broadcasts
h in one message. The leaves return the parity of their sum to their parent; each parent passes to its own
parent the sum mod 2 of its children's values and of its own sum.

TestOut(x, j, k) checks if there is any edge leaving T_x whose weight is in a given interval [j, k]. To do
so, in each local computation at node v, the definitions above for TestOut(x) are changed so that

\[ \sum_{e \in E(v)} h(e) \bmod 2 \quad \text{is replaced by} \quad \sum_{e \in E(v) \,\wedge\, weight(e) \in [j,k]} h(e) \bmod 2. \]

2.2 High probability TestOut 

TestOut achieves a constant probability of correctness if the set is non-empty and is always correct if it is
empty. The probability can be amplified to high probability by O(log n) independent parallel repetitions.
However, this would require O(log^2 n) bits. Alternatively, deterministic amplification methods can bring
down the randomness to O(log n). However, this is not simple and would require each node to construct
a portion of an averaging sampler.

Instead, we take advantage of the type of set we are looking at and introduce a high probability version
of TestOut. W.h.p., HP-TestOut(x) outputs 1 if the tree T_x has any leaving edge. If there is no such
edge, it always returns 0.

For a vertex u, let E^↑(u) = {(u, v) ∈ E} and E^↓(u) = {(v, u) ∈ E}. For the tree T, E^↑(T) = ⋃_{u ∈ T} E^↑(u)
and E^↓(T) = ⋃_{u ∈ T} E^↓(u).

Observation 1. There is an edge {u, v} ∈ E with only one endpoint in T if and only if E^↑(T) ≠ E^↓(T).

Thus, to implement HP-TestOut, we need only test whether E^↑(T) ≠ E^↓(T). To test set equality efficiently,
we use a method from [7] based on Schwartz-Zippel polynomial identity testing. Let B be the
number of edges incident to nodes in T. To achieve probability of error ε(n), it suffices to use any prime
p > max{maxEdgeNum(T), B/ε(n)}, with |p| < w, the maximum message size. We note that if w is
sufficiently large and known to all nodes, we may take p to be the maximum prime p with |p| < w or have
some other predetermined value for p. For p and an edge set D, we define a polynomial over Z_p by

\[ \mathcal{P}(D)(z) = \prod_{e \in D} \bigl(z - \text{edge-number}(e)\bigr) \bmod p. \]

From [7]: if E^↑(T) ≠ E^↓(T), then for α chosen uniformly at random from Z_p,

\[ \Pr\left[ \mathcal{P}(E^{\uparrow}(T))(\alpha) = \mathcal{P}(E^{\downarrow}(T))(\alpha) \right] < \varepsilon(n). \qquad (2) \]

HP-TestOut(x): {assumes x knows ε(n)}

0) If p is not known by all nodes, then x initiates a broadcast-and-echo to find maxEdgeNum and B (by
summing up the degrees of nodes in T) and, using these, determines p.

1) x initiates a broadcast-and-echo in which a randomly selected α (and p if necessary) is passed
to all nodes in the tree in the broadcast phase. Each node y locally computes Local^↑(y) = P(E^↑(y))(α)
and Local^↓(y) = P(E^↓(y))(α). Upon receiving P(E^↑(T_z))(α) and P(E^↓(T_z))(α) from each of its children
z, each node y computes and sends to its parent

\[ \mathcal{P}(E^{\uparrow}(T_y))(\alpha) = Local^{\uparrow}(y) \cdot \prod_{z \text{ child of } y} \mathcal{P}(E^{\uparrow}(T_z))(\alpha) \]

and

\[ \mathcal{P}(E^{\downarrow}(T_y))(\alpha) = Local^{\downarrow}(y) \cdot \prod_{z \text{ child of } y} \mathcal{P}(E^{\downarrow}(T_z))(\alpha), \]

where T_y denotes the subtree of T rooted at y.

3) x determines there is an edge leaving T iff P(E^↑(T))(α) ≠ P(E^↓(T))(α).

Analysis: As all computations are over Z_p, the number of messages sent is < 4|T|, with each containing
|p| = O(log(maxEdgeNum + B)) = O(log n) bits.
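
The set-equality test behind HP-TestOut can be sketched centrally as follows. Several choices here are assumptions for illustration rather than details taken from the text: ordered pairs (u, v) are encoded by concatenating u then v with 16-bit IDs, the prime is fixed to the Mersenne prime 2^61 - 1, and a simple loop over the nodes of T replaces the product aggregated by the echo. The names `pair_key`, `poly_eval` and `hp_testout` are hypothetical.

```python
import random
from typing import Dict, Iterable, List, Set

P = (1 << 61) - 1      # a known prime, far larger than any key used below


def pair_key(u: int, v: int, id_bits: int = 16) -> int:
    """Number for the ordered pair (u, v): u's ID followed by v's ID."""
    return (u << id_bits) | v


def poly_eval(keys: Iterable[int], alpha: int, p: int = P) -> int:
    """Evaluate the product of (alpha - key) over the given keys, mod p."""
    result = 1
    for k in keys:
        result = result * ((alpha - k) % p) % p
    return result


def hp_testout(adj: Dict[int, List[int]], tree_nodes: Set[int],
               rng: random.Random, p: int = P) -> bool:
    """True iff the two products differ, i.e. some edge leaves the tree (w.h.p.)."""
    alpha = rng.randrange(p)
    up = down = 1
    for y in tree_nodes:                 # in the protocol, the echo multiplies these
        up = up * poly_eval((pair_key(y, v) for v in adj[y]), alpha, p) % p
        down = down * poly_eval((pair_key(v, y) for v in adj[y]), alpha, p) % p
    return up != down
```

When no edge leaves the node set, both products range over the same multiset of keys, so the function returns False for every choice of α; a True answer is therefore always correct, matching the one-sided error described above.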

HP-TestOut(x, j, k) checks if there is any edge leaving T_x whose weight is in a given interval [j, k]. To
do so, in each local computation at node v, the definitions above for HP-TestOut(x) are changed so that
E^↑(u) = {(u, v) ∈ E} and E^↓(u) = {(v, u) ∈ E} are replaced by E^↑(u) = {(u, v) ∈ E ∧ weight(u, v) ∈ [j, k]}
and E^↓(u) = {(v, u) ∈ E ∧ weight(v, u) ∈ [j, k]}.

3 MST Build and Repair 

We present a simple method to find the lightest leaving edge using a w-wise search on the edge weights.
This yields a method using O(log n / log log n) broadcast-and-echoes with w = O(log n) bit messages when
the weight of every edge is polynomial. In the appendix, we give a more complicated method which uses
O(log n / log log n) broadcast-and-echoes for superpolynomial edge weights of size u, assuming word size O(log(n + u)).

3.1 Integer edge weights of polynomial size 

Since TestOut uses a single bit “echo”, a single broadcast-and-echo can test w = O(log n) subranges
concurrently, as the same hash function can be used for each of the parallel TestOuts, while the single
bit responses for each subrange's TestOut are returned concurrently in one word. The smallest subrange
testing positive becomes the next range of edge weights to be tested. Before narrowing the range, the
result is verified w.h.p. using HP-TestOut.

Below we describe FindMin and FindMin-C. FindMin-C is like FindMin except that the number
of repetitions of the loop in the algorithm is limited to double the expected number, O(log n / log log n),
rather than O(log n) in the worst case. Let q be the probability that TestOut succeeds. We assume that for
any constant c, x knows a polynomial bound on the network size n in order to set an error parameter for
HP-TestOut, ε(n) ≤ n^{-c-1}, such that ε(n)^{-1} is polynomial in n, and a bound for Count for FindMin
which exceeds (c/q) lg n and is O(log n).

FindMin(x) [FindMin-C] {finds minimum cost edge in (T_x, V \ T_x)}

1. Count ← 0.

2. x determines maxWt(T_x) and maxEdgeNum(T_x) through one broadcast-and-echo and computes
ε(n).

3. x sets j ← 0, k ← maxWt(T_x).

4. x broadcasts an odd hash function f : [1, maxEdgeNum(T_x)] → {0, 1} and also j and k.

5. In parallel for i = 0, 1, 2, ..., w - 1:

set j_i = j + i⌈(k - j)/w⌉ and k_i = j + (i + 1)⌈(k - j)/w⌉ - 1,
return a word in which bit i is the “echo” of TestOut(x, j_i, k_i).

6. Upon receiving the echo, x determines the index min = min{i | TestOut(x, j_i, k_i) = 1} and
initiates TestLow = HP-TestOut(x, 0, j_min - 1) and TestInterval = HP-TestOut(x, j_min, k_min).

7. Upon receiving results,

(a) if TestLow = 0 and TestInterval = 1:
if j_min < k_min then x sets j = j_min and k = k_min + 1; else if j_min = k_min, x broadcasts
“stop” and returns j_min.

(b) else if both return 0, x broadcasts “stop” and returns ∅.

8. For FindMin [resp., FindMin-C]: If Count < (c/q) lg n + (c/q) lg maxWt(T_x) / lg w [resp.,
Count < (2c/q) lg maxWt(T_x) / lg w], increment Count and repeat from Step 4. Else return ∅.
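
The range-narrowing at the heart of the steps above can be sketched as follows. This is a deliberately simplified sketch: an exact oracle `exists` replaces both TestOut and HP-TestOut (so there are no failed rounds to repeat), the subrange arithmetic divides the range width k - j + 1 rather than following Step 5 literally, and the name `w_wise_search` and its input (the set of weights of edges leaving the tree) are assumptions for illustration.

```python
from typing import Iterable, Optional, Tuple


def w_wise_search(leaving_weights: Iterable[int], max_wt: int,
                  w: int) -> Tuple[Optional[int], int]:
    """Return (lightest leaving weight or None, number of narrowing rounds)."""
    weights = list(leaving_weights)

    def exists(lo: int, hi: int) -> bool:
        # Exact stand-in for "is there a leaving edge with weight in [lo, hi]?"
        return any(lo <= x <= hi for x in weights)

    if not exists(0, max_wt):
        return None, 0
    j, k, rounds = 0, max_wt, 0
    while j < k:
        rounds += 1
        step = -(-(k - j + 1) // w)          # ceiling of (range width) / w
        for i in range(w):                   # one round tests all w subranges
            lo = j + i * step
            hi = min(k, lo + step - 1)
            if lo > k:
                break
            if exists(lo, hi):               # lowest positive subrange wins
                j, k = lo, hi
                break
    return j, rounds
```

With weights up to 1023, the search needs 5 rounds for w = 4 but only 2 rounds for w = 32, which illustrates why packing w = O(log n) one-bit echoes into a single word gives the lg maxWt / lg w round count.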

Proof of correctness 

Lemma 2. Let c be any constant s.t. c > 1. With probability 1 - 1/n^c, using asynchronous communication,
FindMin(x) returns the lightest edge leaving a tree T_x in expected time and messages O(|T_x| log n / log log n)
(and worst case O(|T_x| log n) time and messages). With probability 2/3 - 1/n^c, FindMin-C(x) returns the
lightest edge, and with probability 1 - n^{-c} it returns the lightest edge or ∅, using worst case O(|T_x| log n / log log n)
messages and time. If there is no edge leaving the tree, both procedures always return ∅. This assumes x
knows an upper bound on the size n of the network which is polynomial in n.

Proof. FindMin is analyzed first. We observe that if HP-TestOut is always successful, then FindMin
will terminate successfully after no more than lg maxWt / lg(w - 1) successful executions of TestOut:
Let I = [j_i, k_i] be the first interval containing an edge leaving T_x. TestOut always returns a 0 for
earlier intervals, and returns a 1 with constant probability q = 1/8 when I is tested. If TestOut fails to
return a 1 for interval I, then TestLow will detect a 1 and the loop is repeated; otherwise the range is
successfully narrowed. The range is narrowed no more than lg maxWt / lg(w - 1) times. Each successful
narrowing requires an expected 1/q repetitions and overall, in expectation, (1/q) lg maxWt / lg(w - 1) =
O(log n / log log n) iterations of Steps 4-8 suffice to return the lightest edge leaving T (if such exists).

We bound the probability that TestOut fails K times before succeeding lg maxWt / lg(w - 1) =
O(log n / log log n) times, where K = (c/q) lg n: This is given by a tail bound on a random variable with a
binomial distribution with K + lg maxWt / lg(w - 1) trials and constant probability q of heads (success).
Using Chernoff bounds, the probability of this type of failure is < 1/(2n^c) for sufficiently large n.

We now bound the probability that HP-TestOut fails at least once after any of these calls to TestOut:
With an error parameter of ε(n) ≤ n^{-c-1} for HP-TestOut, the probability of the latter over 2K = 2(c/q) lg n
trials is less than 1/(2n^c) by a union bound.

We conclude that the probability of either event occurring is less than 1/n^c, again by a union bound.
Hence, w.h.p., the range is successfully narrowed to the minimum weight edge after 2(c/q) lg n iterations
of Steps 4-8 or, if there is no edge leaving T, then Step 7(b) is executed and ∅ is returned.

For FindMin-C, TestOut is restricted to make only K' = (2c/q) lg maxWt / lg(w - 1) repetitions.
For FindMin-C to return the lightest edge, TestOut cannot fail more than K' times before achieving
lg maxWt / lg(w - 1) successes, and HP-TestOut cannot fail once in 2K' + 2 lg maxWt / lg(w - 1) trials.
The probability that the number of TestOut trials needed exceeds the expected number by a factor of 2c/q + 1
is less than 1/3 for c > 1, by Markov’s Inequality. We next bound the probability that HP-TestOut
fails during any one of the 2K' + 2 lg maxWt / lg(w - 1) trials and repetitions. Since the error parameter
for HP-TestOut is n^{-c-1}, the union bound over all these < n trials gives a probability of this type of
error of 1/n^c. For FindMin-C to return an incorrect lightest edge, HP-TestOut must fail at least
once. Hence, if there is an edge leaving T_x, then with probability 2/3 - 1/n^c, FindMin-C returns the correct
edge, with probability 1 - 1/n^c it returns ∅ or the correct edge, and with probability < 1/n^c it returns an
edge leaving the tree which is not the lightest edge.


3.2 Impromptu repairs of MST 

We now apply FindMin to the problem of repairing an MST. Assume that the updates are well-separated
in the sense that we can complete the processing of an edge update before the next one arrives. Before any
update, assume the network maintains a minimum spanning forest, and each node knows some polynomial
(in n) upper bound on the size of the network.

^ This is only required to compute with probability of error a function of n.

Delete(u, v). When an edge {u, v} is deleted, if u < v and {u, v} was in the MST, then u initiates
FindMin in the marked subtree containing u, T_u. If FindMin returns ∅, it means that {u, v} was a bridge,
and we are done. Otherwise FindMin returns an edge {u', v'}. Then u broadcasts that {u', v'} should
be added to the minimum spanning forest, and u' forwards this message to v'. Both u' and v' mark the
edge {u', v'}. The bottleneck of the complexity is the call FindMin(u), which uses O(n_u log n) messages
for n_u ≤ n nodes in T_u.

Insert(u, v). When an edge {u, v} is inserted, and u < v, u determines if its tree in the MST contains
v and if so, it determines the heaviest edge e on the path from u to v. This is easily done by a
broadcast-and-echo from u. If e is heavier than {u, v}, {u, v} is included in the minimum spanning forest, and u
broadcasts that e should be removed from the MST. A constant number of broadcast-and-echoes are used,
for a total number of messages which is proportional to the size of T_u.
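
The insertion rule can be sketched centrally as below. The representation (a dictionary mapping each node to its marked neighbors and their weights), the names `heaviest_on_path` and `insert_edge`, and the handling of an edge that joins two different trees (it is simply added) are assumptions for illustration; distinct weights and u ≠ v are assumed, and the depth-first walk stands in for the broadcast-and-echo from u.

```python
from typing import Dict, Optional, Tuple

Tree = Dict[int, Dict[int, int]]      # node -> {marked neighbour: edge weight}


def heaviest_on_path(tree: Tree, u: int, v: int) -> Optional[Tuple[int, int, int]]:
    """Heaviest (weight, a, b) on the tree path from u to v; None if v is not in u's tree."""
    stack = [(u, None, None)]         # (node, parent, heaviest edge so far)
    while stack:
        node, parent, best = stack.pop()
        if node == v:
            return best
        for nb, wt in tree[node].items():
            if nb != parent:
                cand = (wt, node, nb)
                stack.append((nb, node, cand if best is None or cand > best else best))
    return None


def insert_edge(tree: Tree, u: int, v: int, wt: int) -> None:
    """Apply the insertion rule to the marked forest in place."""
    heavy = heaviest_on_path(tree, u, v)
    if heavy is not None and heavy[0] < wt:
        return                        # the new edge is heavier: the forest is unchanged
    if heavy is not None:
        _, a, b = heavy               # swap out the heaviest edge on the cycle
        del tree[a][b]
        del tree[b][a]
    tree[u][v] = wt
    tree[v][u] = wt
```

On the path 1-2-3-4 with weights 5, 9, 2, inserting {1, 4} with weight 7 removes the weight-9 edge, while a subsequent insertion of {1, 3} with weight 8 leaves the forest unchanged.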

The analysis of these operations follows from Lemma 2. With the extension of FindMin to
superpolynomial edge weights given in the Appendix, the proof of Theorem 1.2 follows.

3.3 Building an MST 

In a synchronous network, building an MST from scratch is a straightforward application of FindMin.
Recall (from the Introduction) that we assume that every node knows n and the edge weights of its
incident edges, and that the edge weights of all edges are distinct.

The goal is for each node to mark a subset of its neighbors so that the resulting marked edges form 
an MST. The algorithm is an implementation of Boruvka’s parallel algorithm for constructing an MST. 
During the execution, the nodes are partitioned into fragments, each a connected component of the final 
MST. (Initially, each node is a singleton fragment). At each round, in parallel, a minimum weight edge 
incident to each non-maximal tree (fragment) is found by a search started by the fragment leader. 

Electing a fragment leader is straightforward and is similar to a broadcast-and-echo and ideas in [18]:
Since this is a synchronous network, all nodes know when an iteration starts and thus when to start the 
leader election. Moreover, every leaf of a fragment knows it is a leaf and so should start. Each leaf acts 
as if it has just received a broadcast message initiated by the leader (though the leader is not known yet). 
That is, the leaf sends an echo message to its (only) tree neighbor - thus designating that neighbor as its 
parent. As in broadcast-and-echo, every internal node who received an echo from all its neighbors but one, 
sends an echo to that last one. It is then easy to see that either the tree has one median or two. In the 
first case, the echoes converge to that median. Let us elect this one the leader. In the second case, there 
are two neighboring medians. Let the one with the higher identity be the leader. 
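
The echo-convergence election can be simulated round by round. The sketch below is a centralized simulation under the synchronous assumption, not the distributed code; the name `elect_leader` and the dictionary-of-neighbor-sets input are assumptions for illustration.

```python
from typing import Dict, Set


def elect_leader(tree: Dict[int, Set[int]]) -> int:
    """Simulate synchronous echo rounds on one fragment and return its leader."""
    if len(tree) == 1:
        return next(iter(tree))
    # For each node, the tree neighbours it has not yet received an echo from.
    pending = {v: set(nbrs) for v, nbrs in tree.items()}
    sent: Set[int] = set()
    while True:
        # A node echoes once it has heard from all neighbours but one.
        senders = {v: next(iter(p)) for v, p in pending.items()
                   if len(p) == 1 and v not in sent}
        for v, target in senders.items():
            if senders.get(target) == v:     # two medians echo to each other
                return max(v, target)        # higher identity becomes leader
        for v, target in senders.items():
            pending[target].discard(v)
            sent.add(v)
        for v, p in pending.items():
            if not p:                        # heard from every neighbour: single median
                return v
```

On the path 1-2-3 the echoes converge on node 2, while on the path 1-2-3-4 nodes 2 and 3 echo to each other in the second round and node 3 is chosen by the higher-identity rule.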

Let maxTimeMST(n) be the maximum amount of time needed to carry out Steps (a)-(c) in a tree of
size n. We assume a global clock with value time. Let C be the (constant) probability that FindMin-C
returns the minimum edge incident to a tree, if there is one. Let c in the algorithm below be the desired
(constant) parameter, such that the probability of success of the Build MST algorithm should be 1 - n^{-c}.

Build MST {executed by every node x}

1. time ← 0

2. For i = 1 to (40c/C)⌈lg n⌉:

(a) Elect a leader in T_x.

(b) If x = leader then x initiates FindMin-C; else x participates in FindMin-C.

(c) If x is an endpoint of the edge {x, y} which has been returned by FindMin-C, x sends an
Add_Edge message to y across {x, y}.

(d) While time < i * maxTimeMST(n) wait; while waiting, if any Add_Edge message is received
over an edge, mark that edge.
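
For orientation, the phase structure of Build MST follows Borůvka's algorithm, which can be sketched sequentially as below. This sketch is an assumption-laden stand-in: an exact scan over all edges replaces the randomized FindMin-C (so every phase succeeds), a union-find structure replaces the fragments' marked trees, and the name `boruvka_msf` and the (weight, u, v) edge tuples are introduced here for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int, int]          # (weight, u, v), weights assumed distinct


def boruvka_msf(n: int, edges: List[Edge]) -> Set[Edge]:
    """Minimum spanning forest of a graph on nodes 0..n-1."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    chosen: Set[Edge] = set()
    while True:
        best: Dict[int, Edge] = {}           # fragment root -> lightest leaving edge
        for e in edges:
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                     # both endpoints in the same fragment
            for r in (ru, rv):
                if r not in best or e < best[r]:
                    best[r] = e
        if not best:                         # every fragment is maximal
            return chosen
        for e in best.values():              # one phase: add all selected edges
            _, u, v = e
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
            chosen.add(e)
```

Because the weights are distinct, the edges selected in one phase never close a cycle, so adding all of them is safe; in the distributed algorithm a phase may add fewer edges, since each FindMin-C call succeeds only with constant probability C.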

Lemma 3. Let c be any constant, c > 1. With probability 1 - n^{-c}, Build MST constructs an MST in
time and message complexity O(n log^2 n / log log n).

^ Up to some constant factor.

Proof. We call each iteration of the for-loop a phase. At the start of each phase there is a forest of trees consisting of all
marked tree edges. If there is an edge leaving a tree in the graph of all edges, it is a non-maximal
tree. The variable maxTimeMST(n) is set so that every node enters phase i at the same time, after completing
phases j < i. We first show:

Claim 1: After (16c/C) lg n phases, the number of non-maximal trees is no greater than (8c/C) lg n with
probability 1 - 1/n^c.

Proof of Claim 1: Fix a phase in which the number F of non-maximal trees is greater than (8c/C) lg n.
For each T_j which is non-maximal at the start of the phase, let X_j = 1 if the execution of FindMin
returns the minimum weight edge incident to T_j and 0 otherwise. Then each X_j is an independent random
variable with constant probability C of success. Using Chernoff bounds we can see that at least a C/2 fraction of
these FindMin’s succeed with high probability: Pr(Σ_j X_j < (C/2)F) < exp(-((1/2)^2 C (8c/C) lg n)/2) <
1/n^c. Thus, w.h.p., the number of non-maximal components is reduced by a fraction of C/2 in each
phase, until fewer than (8c/C) lg n non-maximal trees remain. This requires no more than lg n / lg(1/(1 - C/2)) =
O(log n) phases. Each phase executed by a tree of size s uses a number of messages O(s log n / log log n), or
O(n log n / log log n) over all non-maximal trees. Now we show:

Claim 2: If there are c' lg n non-maximal components to start, then after ((2c' + 8c)/C) lg n more phases,
there are no non-maximal trees, with probability 1 - 1/n^c.

Proof of Claim 2: For each phase in which there is a non-maximal tree which successfully runs FindMin-C,
the number of non-maximal components is reduced by one. We call a phase successful if there is at least
one successful run of FindMin-C. For any c, after ((2c' + 8c)/C) lg n phases with at least one execution
of FindMin each, at least c' lg n - 1 FindMin’s will be successful with probability at least 1 - 1/n^c.

Putting these claims together: Let c' = (8c/C). Then (16 + 24)(c/C) lg n phases suffice to reduce the
number of non-maximal trees from n to 0, with probability > 1 - n^{-c}. We conclude the proof of the lemma
by observing that O(log n) phases require a total of O(n log^2 n / log log n) messages and time. ■
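The Chernoff step in the proof of Claim 1 can be checked numerically. The following is a minimal sketch, assuming the standard multiplicative lower-tail bound Pr[X < (1 - δ)μ] ≤ exp(-δ^2 μ/2); the function names and the sample values c = 2 and C = 1/16 are illustrative choices and are not taken from the paper.

```python
import math


def chernoff_lower_tail(mu: float, delta: float) -> float:
    """Multiplicative Chernoff bound: Pr[X < (1 - delta) * mu] <= exp(-delta^2 * mu / 2)."""
    return math.exp(-(delta ** 2) * mu / 2)


def claim1_failure_bound(n: int, c: float, C: float) -> float:
    """Bound on Pr[sum_j X_j < (C/2) F] at the threshold F = (8c/C) lg n.

    Each of the F trees succeeds independently with probability C (hypothetical value).
    """
    F = (8 * c / C) * math.log2(n)
    mu = C * F  # expected number of successful FindMin runs in the phase
    return chernoff_lower_tail(mu, 0.5)


for n in (2 ** 10, 2 ** 16, 2 ** 20):
    # The bound evaluates to exp(-c lg n), which is below n^(-c).
    print(n, claim1_failure_bound(n, c=2.0, C=1 / 16), n ** -2.0)
```

Note that the bound does not depend on C once F is fixed at (8c/C) lg n, because the expected number of successes is then 8c lg n.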

4 Unweighted edges 

We now present analogous results for unweighted graphs using less costly, somewhat different techniques. 

4.1 Find any edge leaving a tree 

FindAny, presented below, uses an expected constant number of broadcast-and-echoes to find any edge
leaving T_x. Thus in expectation, we save a factor log n/log log n in the asymptotic cost of FindMin.

The procedure starts with HP-TestOut to determine if there is an edge in the cut w.h.p. If
HP-TestOut returns 1, a routine to find such an edge with a constant probability of success is run
repeatedly until such an edge is found, yielding a constant expected time and message procedure. To
achieve a probability of error n^{-c} in the running times claimed, we assume x knows an ε(n) < 1/(2n^c)
where ε^{-1}(n) is polynomial in n. T below is T_x. We let [r] denote the set {0, 1, ..., r - 1}.

FindAny(x)

1. Count ← 0.

2. x initiates HP-TestOut in T with error parameter ε(n). If ¬(HP-TestOut) then return ∅.

3. Determine the identity of an edge as follows:

   a) x broadcasts a random pairwise independent hash function h : [1, maxEdgeNum(T)] → [r] where
      r is a power of 2 > sum of degrees of nodes in T.

   b) Each node y hashes the edge numbers of its incident edges using h, and computes the vector h(y)
      s.t. h_i(y) is the parity of the set of its incident edges whose edge numbers hash to values in [2^i] for
      i = 1, ..., lg r. If y has no incident edges then h(y) = 0.

   c) The vector h(T) = ⊕_{y∈T} h(y) is computed up the tree, in the broadcast-and-echo return to x.
      Then x broadcasts min = min{i | h_i(T) = 1}.

   d) Let E(y) be the set of edge numbers of edges incident to y. Each node y computes w(y) =
      ⊕{e | e ∈ E(y) ∧ h(e) < 2^min}, and w(T) = ⊕_{y∈T} w(y) is computed up the tree in the
      broadcast-and-echo and returned to x.
      {If there is exactly one edge leaving T with h(e) < 2^min, then w(T) is its edge number.}

4. Test: x broadcasts w(T) to obtain Sum = the number of endpoints in T incident to the edge given
   by w(T). The test succeeds iff Sum = 1.

5. If Test succeeds, return w(T); else
   for FindAny-C, return ∅;
   for FindAny, if Count > 16 ln(ε^{-1}(n)) then return ∅; else increment Count and repeat steps 3-5.
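Steps 3 and 4 can be illustrated with a centralized simulation. The sketch below is not the distributed protocol: it evaluates the XOR sums directly instead of by broadcast-and-echo, it omits HP-TestOut, and it assumes a graph without self-loops. The hash family ((a·e + b) mod p) mod r is a common stand-in that is only approximately pairwise independent, and all names (`make_hash`, `find_any_once`, `PRIME`) are hypothetical.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime for the stand-in hash family (an assumption)


def make_hash(r, rng):
    """Return h(e) = ((a*e + b) mod PRIME) mod r with random a, b."""
    a = rng.randrange(1, PRIME)
    b = rng.randrange(PRIME)
    return lambda e: ((a * e + b) % PRIME) % r


def find_any_once(edges, tree_nodes, rng):
    """One attempt of steps 3-4. `edges` maps edge number -> (u, v).

    Returns the number of an edge leaving the tree, or None if the attempt fails.
    """
    incident = {y: [e for e, (u, v) in edges.items() if y in (u, v)] for y in tree_nodes}
    degree_sum = sum(len(es) for es in incident.values())
    r = 1
    while r <= degree_sum:  # r is a power of 2 greater than the sum of degrees
        r *= 2
    lg_r = r.bit_length() - 1
    h = make_hash(r, rng)

    # Steps 3b-3c: parity[i] is the XOR over all tree nodes of h_i(y).
    parity = [0] * (lg_r + 1)
    for y in tree_nodes:
        for e in incident[y]:
            for i in range(1, lg_r + 1):
                if h(e) < (1 << i):
                    parity[i] ^= 1
    levels = [i for i in range(1, lg_r + 1) if parity[i] == 1]
    if not levels:
        return None
    m = min(levels)

    # Step 3d: XOR of edge numbers hashing below 2^min; internal edges cancel in pairs.
    w = 0
    for y in tree_nodes:
        for e in incident[y]:
            if h(e) < (1 << m):
                w ^= e

    # Step 4: the test succeeds iff w names an edge with exactly one endpoint in the tree.
    if w not in edges:
        return None
    endpoints_in_tree = sum(1 for end in edges[w] if end in tree_nodes)
    return w if endpoints_in_tree == 1 else None
```

An edge with both endpoints in the tree contributes its number twice to every XOR and so cancels, which is why only edges leaving the tree influence `parity` and `w`.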


Proof of correctness 

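Let h be a 2-independent function from a universe U into [2^l] for some l > 2. Let W ⊆ U s.t. 0 < |W| < 2^{l-1}.

Lemma 4. With probability at least 1/16, if |W| > 0 then there is an integer j such that exactly one w ∈ W
hashes to a value in [2^j].

Proof. We prove the statement of the lemma for j = l - ⌈lg |W|⌉ - 1. Then 1/(4|W|) < 2^j/2^l ≤ 1/(2|W|).
Now

\[
\begin{aligned}
\Pr\bigl[\exists!\, w \in W : h(w) \in [2^j]\bigr]
&= \sum_{w \in W} \Pr\bigl[h(w) \in [2^j] \,\wedge\, \forall w' \in W \setminus \{w\} : h(w') \notin [2^j]\bigr] \\
&= \sum_{w \in W} \Pr\bigl[h(w) \in [2^j]\bigr] \cdot \Pr\bigl[\forall w' \in W \setminus \{w\} : h(w') \notin [2^j] \,\big|\, h(w) \in [2^j]\bigr] \\
&\ge \sum_{w \in W} \Pr\bigl[h(w) \in [2^j]\bigr] \Bigl(1 - \sum_{w' \in W \setminus \{w\}} \Pr\bigl[h(w') \in [2^j] \,\big|\, h(w) \in [2^j]\bigr]\Bigr) \\
&= \sum_{w \in W} \Pr\bigl[h(w) \in [2^j]\bigr] \Bigl(1 - \sum_{w' \in W \setminus \{w\}} \Pr\bigl[h(w') \in [2^j]\bigr]\Bigr) \quad \text{by 2-wise independence} \\
&= |W| \frac{2^j}{2^l} \Bigl(1 - (|W| - 1) \frac{2^j}{2^l}\Bigr)
> \frac{|W|}{4|W|} \Bigl(1 - \frac{|W|}{2|W|}\Bigr) = \frac{1}{8} \ge \frac{1}{16}.
\end{aligned}
\]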
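The bound of Lemma 4 can be probed empirically. The sketch below is a Monte Carlo estimate under stated assumptions: it uses a random affine map over GF(2), h(x) = Ax ⊕ b, which is one exactly pairwise independent family with a power-of-two range, and it fixes j = l - ⌈lg |W|⌉ - 1 as in the proof. The helper names and the example set W are hypothetical.

```python
import math
import random


def random_linear_hash(key_bits, out_bits, rng):
    """Random affine map over GF(2): output bit i is parity(row_i & x) XOR offset bit i."""
    rows = [rng.getrandbits(key_bits) for _ in range(out_bits)]
    offset = rng.getrandbits(out_bits)

    def h(x):
        value = 0
        for i, row in enumerate(rows):
            bit = bin(row & x).count("1") & 1
            value |= bit << i
        return value ^ offset

    return h


def isolation_rate(W, key_bits, l, trials, seed=0):
    """Estimate Pr[exactly one w in W has h(w) in [2^j]] for j = l - ceil(lg |W|) - 1."""
    rng = random.Random(seed)
    j = l - math.ceil(math.log2(len(W))) - 1
    hits = 0
    for _ in range(trials):
        h = random_linear_hash(key_bits, l, rng)
        if sum(1 for w in W if h(w) < (1 << j)) == 1:
            hits += 1
    return hits / trials


# Five keys of 4 bits each, hashed into [2^6]; the condition 0 < |W| < 2^(l-1) holds.
print(isolation_rate([3, 5, 9, 12, 14], key_bits=4, l=6, trials=4000, seed=1))
```

The estimate should land well above 1/16, since the lemma's derivation is a lower bound that holds for every pairwise independent family.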


Lemma 5. If there is no edge leaving T_x, then FindAny(x) and FindAny-C(x) return ∅. Otherwise,

• FindAny(x) returns an edge leaving T_x w.h.p. It uses expected time and messages O(n); and

• FindAny-C(x) returns an edge leaving T_x with probability at least 1/16, else it returns ∅. It uses
worst case time and messages O(n).

Proof. Let W be the set of edge numbers of edges leaving T. If |W| = 0, then HP-TestOut returns 0
and x returns ∅. If |W| > 0 then with probability ≥ 1 - 1/(2n^c) HP-TestOut succeeds and x continues
to Step 3. Given x goes on to Step 3, by Lemma 4 with probability at least 1/16, there is a j such that
exactly one edge with distinct edge number e in W hashes to a value in [2^j] ("Event A"). Because all edges
incident to T which are not leaving T have both endpoints in T, their edge numbers XOR to 0 when summed
over T. Hence, when Event A occurs, ⊕_{y∈T} h_j(y) = |{e' ∈ W : h(e') ∈ [2^j]}| mod 2 = 1, and therefore
min ≤ j. However, ⊕_{y∈T} h_min(y) = 1 implies that there is at least one edge number in W which hashes to
[2^min] ⊆ [2^j], so we conclude that there is exactly one such edge number e ∈ W hashing to [2^min].

When Event A occurs, w(T) = e, the Test in Step 4 succeeds, and an edge leaving T is
returned in Step 5. Thus the probability of success of FindAny-C is the probability that HP-TestOut
succeeds, followed by Event A, which is ≥ 1/16 - 1/(2n^c). In FindAny, if HP-TestOut succeeds, then Steps 3-5
are repeated up to 16 ln(ε^{-1}(n)) ≥ 16 ln(2n^c) times until they succeed. The probability of failure of all
these repetitions is < (1 - 1/16)^{16 ln(2n^c)} < 1/(2n^c). The total probability of failure is therefore no
more than this probability plus the probability of failure of HP-TestOut, for a total probability of failure
< 1/n^c. The expected number of repetitions of Steps 3-5 until success is at most 16 (and the worst case is
O(log n)).

Since a single run of Steps 1-5 requires O(n) time and messages, the lemma follows.
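The repetition count in this proof is simple arithmetic and can be checked directly. The sketch below assumes ε(n) = 1/(2n^c) exactly, so that the cap on repetitions is 16 ln(2n^c) rounded up; the function names are illustrative only.

```python
import math


def max_repetitions(n, c):
    """Cap on repetitions of Steps 3-5, assuming epsilon(n) = 1 / (2 n^c)."""
    return math.ceil(16 * math.log(2 * n ** c))


def all_fail_bound(n, c):
    """Probability that every repetition fails when each succeeds with probability 1/16."""
    return (1 - 1 / 16) ** max_repetitions(n, c)


for n in (10, 1000, 10 ** 6):
    # The second column stays below the third, 1 / (2 n^c), for c = 2.
    print(n, all_fail_bound(n, 2), 1 / (2 * n ** 2))
```

The inequality holds because (15/16)^16 < 1/e, so 16 ln(2n^c) repetitions drive the failure probability below 1/(2n^c).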


4.2 Building an ST 

This algorithm is obtained by modifying the algorithm for building the MST. Two modifications are
necessary. The first is the substitution of FindAny-C for FindMin-C in each phase. Replacing FindMin-C,
which uses O(n log n/log log n) messages per phase, by FindAny-C, which uses O(n), reduces the asymptotic
costs by a factor of log n/log log n.

The second is more subtle. In FindMin, when all fragments pick minimum weight edges leaving them,
all of them are MST edges. This is because the weights of the edges are distinct, so there is only one
minimum weight edge leaving any fragment. Moreover, such an edge must be in the MST. When k fragments of
the unweighted graph pick edges leaving them, it is possible that k distinct edges are picked and (at most
one) cycle is formed by "potential" tree edges. This needs to be detected before the next phase begins.

If all nodes run the leader election algorithm described in Section 4.3, the nodes on the cycle will
be exactly the set of nodes which have heard from all but two of their neighbors. After the maximum
time needed for leader election, these nodes will be aware they are on a cycle. Moreover, they know their
neighbors in the cycle, since they have not heard from them. Each node randomly picks one of the two
edges incident to it in the cycle to exclude and sends a message along that edge to its other endpoint. If
some edge is picked by both its endpoints, then this edge is unmarked, i.e., not added to the tree. Leader
election is again run to test if there is a cycle. If there still is a cycle, all of the edges in the cycle are
unmarked and not included as tree edges in the next phase.
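The random cycle-breaking step can be examined by exhaustive enumeration. The sketch below is a centralized model, not the message-passing protocol: node i on a cycle of k nodes chooses to exclude either the edge to its predecessor or the edge to its successor, and an edge is unmarked when both endpoints choose it. The encoding of choices as 0/1 and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def unmarked_edges(choices):
    """Edge i joins node i and node (i + 1) % k.

    choices[i] == 1 means node i excludes the edge to its successor (edge i);
    choices[i] == 0 means node i excludes the edge to its predecessor (edge i - 1).
    Edge i is unmarked iff both of its endpoints exclude it.
    """
    k = len(choices)
    return [i for i in range(k) if choices[i] == 1 and choices[(i + 1) % k] == 0]


def break_probability(k):
    """Exact probability that at least one edge of a k-cycle is unmarked."""
    broken = 0
    for choices in product((0, 1), repeat=k):
        removed = unmarked_edges(choices)
        assert len(removed) <= k // 2  # unmarked edges are never adjacent
        if removed:
            broken += 1
    return Fraction(broken, 2 ** k)


for k in range(3, 8):
    print(k, break_probability(k))
```

Only the two assignments in which every node points the same way around the cycle leave it intact, so the enumeration yields 1 - 1/2^(k-1), and at most half of the cycle edges are ever unmarked.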

The analysis of this algorithm appears in Appendix B. Intuitively, beyond the analysis of Build MST,
it shows that some edge is unmarked (breaking the cycle) with high probability. Note that at most
half of the chosen outgoing edges are unmarked, so "enough" mergers still occur. This ensures progress.
If the cycle is small and no edge is unmarked, then the whole cycle is removed. Since the (removed) cycle is
small, still "enough" mergers occur. This establishes the following lemma.

Lemma 6. Let c be any constant, c > 1. There is an algorithm which, with probability 1 - n^{-c}, constructs
an ST in time and message complexity O(n log n).

4.3 Repairing an ST 

This is a straightforward adaptation of the methods used for repairing an MST, except that FindAny is 
used in place of FindMin and a factor of log n/log log n is saved from the asymptotic cost. 

5 Open problems and conclusions 

We have adapted a technique from streaming and dynamic sequential graph algorithms to obtain a surprising
result: a broadcast tree can be constructed with O(n log n) messages (and time) in the
CONGEST model w.h.p., a problem believed for 25 years or more to have a lower bound of Ω(m) on the number
of messages. (In a model allowing much longer messages, it was known how to avoid sending
messages over some edges [H]; intuitively, [1] showed that for each such avoided edge, the identity of one
of its endpoints needs to be delivered uncompressed to the other endpoint; we have shown that those
identities can be compressed significantly if the ID space is of a reasonable size, up to even exponential
in n.) We have also shown a very simple way to repair STs and MSTs in O(n) and O(n log n/log log n)
expected time and messages, respectively; previously it was suggested that reducing the message complexity
to o(m) requires auxiliary information to be stored between updates. By avoiding the need to (store and)
distribute auxiliary information, we also manage to make the o(m) message complexity worst case rather than
just amortized as in previous papers. Can these yield practical methods for real dynamic networks?

A number of interesting theoretical problems remain for ST and MST construction. Can an ST be
constructed by a deterministic or Las Vegas algorithm in o(m) messages in the KT_1 model? What kind of
bounds on n do the nodes need to know? Can these results be made to work in the asynchronous model of
communication? Is it possible to form an ST in time less than O(n log n) with o(m) messages? Finally, are
O(n log n/log log n) messages required for FindMin, or can this be pushed closer to the cost of FindAny?

Acknowledgment: We would like to thank Moni Naor, Gopal Pandurangan, and Ely Porat for useful
comments.

A Accommodating superpolynomial sized edge weights

Suppose the maximum edge weight has w bits where w is the message size. We show that O(log n/log log n)
broadcast-and-echoes suffice. In the previous subsection, the log n-wise "pivots" were chosen obliviously.
Here, we use pivots based on randomly chosen edges.

Let d be the total number of endpoints of nontree edges incident to the tree. Let k = √(log n/log log n).
The Sample(p) routine described below returns r sequences of w/k bits from randomly sampled edges with
prefix p. These edges are nontree edges with one or two endpoints in the tree, such that each nontree edge
with prefix p incident to T_x is picked with probability 1/m or 2/m, where m is the total number of such edges.

FindMin(x) {finds minimum cost edge in (T_x, V \ T_x)}

1. x sets j ← 1; k ← rc/Zc, p = ∅, and announces start, and sends "start" to initiate TestOut(x, j, k).

2. If HP-TestOut(x, j, k) = 0, x broadcasts "stop" and returns ∅.

3. While j < k repeat:

   {Loop 1}

4. x broadcasts one O(log n)-bit low probability odd hash function f and one high probability hash
   function F.

5. Run Sample(j, k).

6. In parallel for i = 0, 1, 2, ..., w/r, run TestOut(x, p·j_i·0, p·j_{i+1}·0) using f on each interval, where
   j_0 = j, the final endpoint is k, and the other j_i's are given by Sample(p). Assume they are ordered by value.
   {End Loop 1}

7. Let min be the minimum i s.t. TestOut(x, p·j_i, p·j_{i+1}) = 1.

   {Verify that there are no edges with weights with lower prefixes which leave the tree by testing
   HP-TestOut(x, p·j_i, p·j_{i+1} - 1) = 1.} If there are, rerun the previous step to recompute the
   minimum.

8. {Continue to look for an extension of p or a single edge.}

   If j_min = j_{min+1}, extend p to p·j_min, set j to {0}^{w/r} and k = {1}^{w/r}.

9. Else set j to j_min and k to j_{min+1}; broadcast these values.

10. Return the edge given by augmented weight j.

Let m_y be the number of nontree edges incident to node y whose weights have property P. Let
m_T = Σ_{y∈T} m_y. Let S be the multiset of such edges, where an edge appears twice if both its endpoints
are in T.

Sample(j, k) {returns the prefixes of r edges drawn uniformly at random from S whose weights are in the
range [p·j, p·k].}

To implement Sample(p, j, k), x initiates a broadcast-and-echo to its tree. Consider the tree rooted at
x. On the echo, each node y determines and stores the sum of m_z over all z in the subtree rooted at y.
Starting with the leaves, each node y passes up this sum, adding on m_y.

Let each node arbitrarily order its children. Then to sample r elements, x randomly determines how
many samples come from itself and from its children by drawing r random numbers in the range
{1, ..., total}. It randomly chooses the samples from itself, and sends the number of requests which fall
into the range of each child's edges to the child, which repeats this procedure. This requires only log r bits.
Another echo returns the samples in parallel as each node affixes its choices. No more than r prefixes
are sent to the root in total, each of size w/r, so that they all fit in one message.
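The top-down distribution of sample requests can be sketched as follows. This is a centralized illustration with hypothetical names (`subtree_totals`, `sample`, `children`, `edges_at`): it samples with replacement from the multiset of edge endpoints, it returns whole edge labels rather than w/r-bit prefixes, and it ignores the restriction to weights in the range [p·j, p·k].

```python
import random


def subtree_totals(children, m, root):
    """Echo phase: total[y] = m[y] plus the totals of all children of y."""
    total = {}

    def visit(y):
        total[y] = m[y] + sum(visit(z) for z in children.get(y, []))
        return total[y]

    visit(root)
    return total


def sample(children, edges_at, root, r, rng):
    """Draw r edge endpoints uniformly at random, splitting requests down the tree."""
    m = {y: len(es) for y, es in edges_at.items()}
    total = subtree_totals(children, m, root)
    picked = []

    def distribute(y, requests):
        if requests == 0 or total[y] == 0:
            return
        own = 0
        per_child = {z: 0 for z in children.get(y, [])}
        for _ in range(requests):
            t = rng.randint(1, total[y])  # a random number in {1, ..., total}
            if t <= m[y]:
                own += 1  # falls into y's own range
            else:
                t -= m[y]
                for z in children.get(y, []):  # children in their fixed order
                    if t <= total[z]:
                        per_child[z] += 1
                        break
                    t -= total[z]
        picked.extend(rng.choice(edges_at[y]) for _ in range(own))
        for z, count in per_child.items():
            distribute(z, count)  # only a count is passed down to each child

    distribute(root, r)
    return picked
```

Each node forwards only a request count to each child, which matches the observation that log r bits per child suffice.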

Lemma 7. W.h.p., the lightest edge is in the weight interval (p·j, p·k) or there is no such edge and the
algorithm returns ∅.

Proof. Initially this is true. If there is no such edge, then HP-TestOut(j, k) returns 0 w.h.p. and the
algorithm returns ∅.

Assume it's true at the start of the loop. Then TestOut must return a 1 for some interval or it is
rerun. If it returns a 1 then the interval tested must contain an edge leaving the tree; the lighter intervals
are tested w.h.p. to confirm they have no lighter edge leaving the tree. Therefore the interval min must
contain the lightest edge. If j_min and j_{min+1} agree then p·j_min must be the prefix of the lightest edge
weight. Each high probability test has probability of failing of 1/n^c. There are only a constant number per
iteration, hence by a union bound they all succeed w.h.p.

Theorem A.1. In a tree whose nontree edge weights are of length w bits, there is an asynchronous
algorithm to find the lightest nontree edge leaving the tree in O(log n/log log n) expected
broadcast-and-echoes with message size w.

Proof. Correctness follows from the lemma. We first examine the number of iterations of loop 1.

Note that loop 1 terminates when TestOut succeeds in the interval containing the lightest edge.
This happens with constant probability. Hence it repeats a constant number of expected times.

Consider the edges (and possible duplicates) in S ordered by weight. With each sampling, there is
a constant probability that a sample edge will be chosen which is within m_T/r of the lightest edge on
either side of the ordering, or there are fewer than m_T/r such edges. Hence if their prefixes are different,
there is a constant probability that S in the next round has size at most 2m_T/r. The number of these
"successful" samplings needed to shrink the number of such edges to less than r is log_r m_T. The expected
number of samplings to achieve this many successful rounds is O(log_r m_T) ≤ log(n^2)/log r = O(log n/log log n).

On the other hand, if the prefixes are the same, then we extend the prefix another w/k bits. The
maximum number of these samplings is w/(w/r) = r = √(log n/log log n). ■

B Analysis of Lemma 6

We analyze Build ST by modifying the analysis of Build MST. We observe that with probability at least
1 - 1/2^{k-1} for a cycle of size k, at least one edge is unmarked and there is no more cycle, while no more
than half the edges in the cycle are unmarked. If an edge is found in the cycle that can be unmarked, then
if the number of fragments would have dropped by a factor of C/2 in the analysis of Build MST, in Build
ST it drops by at least a factor of C/4, and the analysis is not very different.

Suppose no edge in the cycle is found and all edges in the cycle are unmarked. As in the analysis of Build
MST, there are two cases. The first is when there are at least c' lg n fragments: If the cycle involves less
than half the fragments which successfully found edges, then the number of edges which become unmarked
is less than half the total number of edges which were marked, and again, the number of fragments drops
by a factor of C/4 instead of C/2. The case where the edge unmarking does not break the cycle but it
involves at least half the c' lg n fragments (so the whole large cycle is removed) happens with probability
less than 1/2^{(c'/2) lg n - 1} = 2/n^{c'/2}. For c' > 2c + 1, w.h.p., this case does not happen.

The second case is when the number of fragments is less than c' lg n. There is a probability of at least
C that at least one fragment will find an edge leaving, and given that this happens, there is a probability
of at least 1/2 that if a cycle is formed, it will be broken, so that at least one new marked edge is added
during the phase with probability C/2. Following an analysis similar to that of Build MST, after O(log n)
more phases, the ST will be formed w.h.p.